Waste is a significant global concern: production keeps rising with the growing global population and higher living standards. People are increasingly concerned about waste generation and its consequences, and are seeking solutions to address the issue.
Recycling involves transforming waste materials into new materials and items. This process often includes the recovery of energy from waste materials. Recycling serves as an alternative to traditional waste disposal methods and has the potential to conserve resources and reduce greenhouse gas emissions. By preventing the waste of potentially valuable materials and decreasing the need for new raw materials, recycling helps to minimize energy use and pollution.
The aim of this task is to create a project that can identify the type of waste (Recyclable or Organic) and sort it into specific categories. By developing this intelligent image-classification system, our objective is to decrease manual labor and efficiently separate the waste into different groups. A dataset of Recyclable and Organic items from the Kaggle website will be utilized for training and testing the model.
To address waste-object classification with deep learning, an appropriate dataset is needed. The Waste Classification dataset from Kaggle fits this purpose: it consists of labeled images of waste objects categorized into two groups, Recyclable and Organic.
What dataset can you use to develop a deep learning solution?
The dataset comprises 25,077 images of objects labeled as either 'Organic' or 'Recyclable'. It has been divided into two parts: roughly 90% (22,564 images) allocated for training and 10% (2,513 images) for testing. The target classes are 'Organic' and 'Recyclable'. This Waste Classification dataset is well suited for training a deep learning model such as a convolutional neural network (CNN) to accurately classify waste objects into their respective categories.
How many images do you need? How many for training? How many for testing?
The quantity of images necessary for training an image classification model relies on various factors, including the difficulty of the task, the diversity of the data, and the size and structure of the model. Generally, larger datasets have a tendency to yield improved outcomes, although this entails higher computational demands and longer training durations.
To train an image classification deep learning model, it is typically advised to have a dataset consisting of several thousand images. For instance, the Kaggle dataset called Waste Classification data comprises more than 22,500 images, which is considered an adequate amount for training a waste object classification model.
In general, it is recommended to split the dataset into training and testing sets. Most of the data, approximately 85% of the dataset, should be allocated for training and validating the model, while the rest should be reserved for testing.
We should also include a validation set alongside the training and testing sets to adjust the hyperparameters of the model. The purpose of this validation set is to assess the model's performance on unseen data during training and to avoid overfitting.
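The split described above can be sketched in plain Python. The file names below are hypothetical stand-ins for the real image paths, and the fractions mirror the ones used in this report (about 15% held out for testing, then 20% of the remainder reserved for validation):

```python
import random

def split_dataset(paths, test_frac=0.15, val_frac=0.20, seed=42):
    """Shuffle and split file paths into train / validation / test sets."""
    rng = random.Random(seed)
    paths = list(paths)
    rng.shuffle(paths)
    # Hold out the test set first, so it is never touched during training.
    n_test = int(len(paths) * test_frac)
    test, rest = paths[:n_test], paths[n_test:]
    # Carve the validation set out of the remaining training data.
    n_val = int(len(rest) * val_frac)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test

# Hypothetical file names standing in for the real image paths.
all_images = [f"img_{i:05d}.jpg" for i in range(1000)]
train, val, test = split_dataset(all_images)
print(len(train), len(val), len(test))  # 680 170 150
```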
Do you need to label the images yourself?
To train an image classification model, it is necessary to ensure that the images are annotated with their respective class labels. The process of annotating the images consists of assigning a particular category or class to each image within the dataset.
In our case, however, the Waste Classification dataset already comes with labeled images, which saves time and effort during data preparation.
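Because the dataset stores one folder per class (the layout `flow_from_directory` expects), the labels come for free from the directory names and no manual annotation is needed. A minimal sketch using a hypothetical miniature dataset:

```python
import os
import tempfile

# Hypothetical miniature copy of the dataset layout: one folder per class.
root = tempfile.mkdtemp()
for cls, names in {"Organic": ["banana.jpg", "leaf.jpg"],
                   "Recycle": ["bottle.jpg"]}.items():
    os.makedirs(os.path.join(root, cls))
    for name in names:
        open(os.path.join(root, cls, name), "w").close()

# The class label of each image is simply its parent directory's name.
labelled = sorted(
    (fname, cls)
    for cls in os.listdir(root)
    for fname in os.listdir(os.path.join(root, cls))
)
print(labelled)
# [('banana.jpg', 'Organic'), ('bottle.jpg', 'Recycle'), ('leaf.jpg', 'Organic')]
```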
How do you determine if your model is good enough?
There are multiple methods to assess the performance of an image classification model, including monitoring training and validation accuracy/loss curves, evaluating accuracy on a held-out test set, computing per-class precision, recall, and F1-score, and inspecting the confusion matrix.
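These metrics can be computed directly from true and predicted labels. A pure-Python sketch on toy labels (0 = Organic, 1 = Recycle); later in this notebook the equivalent numbers come from scikit-learn's `classification_report` and confusion matrix:

```python
# Toy ground-truth and predicted labels (0 = Organic, 1 = Recycle).
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 1, 1, 0, 1, 0]

# Confusion-matrix cells for the positive class (1).
tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f1)  # 0.75 0.75 0.75 0.75
```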
# Import libraries for data processing and visualization
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
# Suppress warnings
import warnings
warnings.filterwarnings('ignore')
# Import scikit-learn for data splitting, evaluation, and loading files
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
# Import TensorFlow and Keras for deep learning
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import (
Conv2D, Activation, Dropout, BatchNormalization,
MaxPooling2D, Flatten, Dense
)
from tensorflow.keras.models import Sequential, Model, load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import backend as K
from tensorflow.keras.utils import plot_model
from glob import glob
from google.colab import files
uploaded = files.upload()
Saving new_test.zip to new_test.zip
Saving TEST.zip to TEST.zip
Saving TRAIN.zip to TRAIN.zip
import zipfile
import os
train_zip_path = 'TRAIN.zip'
test_zip_path = 'TEST.zip'
new_test_zip_path = 'new_test.zip'
train_extract_dir = '/content/TRAIN'
test_extract_dir = '/content/TEST'
new_test_extract_dir = '/content/new_test'
# Extract the TRAIN.zip file
with zipfile.ZipFile(train_zip_path, 'r') as zip_ref:
    zip_ref.extractall(train_extract_dir)
# Extract the TEST.zip file
with zipfile.ZipFile(test_zip_path, 'r') as zip_ref:
    zip_ref.extractall(test_extract_dir)
# Extract the new_test.zip file
with zipfile.ZipFile(new_test_zip_path, 'r') as zip_ref:
    zip_ref.extractall(new_test_extract_dir)
#Loading train and test data
train_set = '/content/TRAIN/TRAIN/'
test_set = '/content/TEST/TEST/'
train_data_generation = ImageDataGenerator(rescale=1./255, validation_split=0.2)
test_data_generation = ImageDataGenerator(rescale=1./255)
# Create a train dataset generator
train_data = train_data_generation.flow_from_directory(train_set, class_mode='categorical', batch_size = 128,
target_size=(224,224), subset='training', color_mode= "rgb",
classes=['Organic', 'Recycle'])
Found 18052 images belonging to 2 classes.
# Create a validation dataset generator
valid_data = train_data_generation.flow_from_directory(train_set, class_mode='categorical', batch_size = 128,
target_size=(224,224), subset='validation', color_mode= "rgb",
classes=['Organic', 'Recycle'])
Found 4512 images belonging to 2 classes.
# Create a test dataset generator
test_data = test_data_generation.flow_from_directory(test_set, class_mode='categorical', batch_size = 128,
target_size=(224,224), shuffle=False,color_mode= "rgb",
classes=['Organic', 'Recycle'])
Found 2513 images belonging to 2 classes.
label = (train_data.class_indices)
label = dict((v,k) for k,v in label.items())
print(label)
{0: 'Organic', 1: 'Recycle'}
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(224, 224, 3)))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D())
model.add(Conv2D(64, (3, 3)))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D())
model.add(Conv2D(128, (3, 3)))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D())
model.add(Flatten())
model.add(Dense(256))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(64))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Dense(2)) # Output
model.add(Activation("sigmoid"))
plot_model(model,show_shapes=True)
Here's a brief description of its structure:
The first layer has 32 filters of size (3, 3) and expects input images of size (224, 224, 3). It uses the ReLU activation function and includes batch normalization. Max-pooling is applied afterward. The second layer has 64 filters of size (3, 3) with ReLU activation, batch normalization, and max-pooling. The third layer consists of 128 filters of size (3, 3) with ReLU activation, batch normalization, and max-pooling.
After the convolutional layers, the data is flattened into a 1D vector to prepare it for fully connected layers.
Next comes a dense layer with 256 units and ReLU activation, followed by dropout at a rate of 0.5, and then another dense layer with 64 units, ReLU activation, and dropout at 0.5.
The final layer consists of 2 units, one per class for this binary classification task. It uses the sigmoid activation function to produce a probability score for each class.
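As a sanity check on the shapes reported by `plot_model`, the spatial size after each 3x3 'valid' convolution followed by 2x2 max-pooling block can be computed by hand; the flattened vector feeding the first dense layer works out to 26 x 26 x 128 = 86,528 values:

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, window=2):
    """Output size of MaxPooling2D with its default 2x2 window, stride 2."""
    return size // window

size = 224
for filters in (32, 64, 128):
    size = pool_out(conv_out(size))
    print(f"after {filters}-filter block: {size}x{size}")
# after 32-filter block: 111x111
# after 64-filter block: 54x54
# after 128-filter block: 26x26

flat = size * size * 128
print("flattened vector length:", flat)  # 86528
```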
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
hist = model.fit(train_data, epochs=10, validation_data=valid_data)
Epoch 1/10 - 142/142 - 54s 381ms/step - loss: 2.6329 - accuracy: 0.7205 - val_loss: 1.3309 - val_accuracy: 0.6753
Epoch 2/10 - 142/142 - 55s 383ms/step - loss: 0.5935 - accuracy: 0.7750 - val_loss: 0.5878 - val_accuracy: 0.8054
Epoch 3/10 - 142/142 - 55s 387ms/step - loss: 0.4845 - accuracy: 0.8048 - val_loss: 0.5695 - val_accuracy: 0.7788
Epoch 4/10 - 142/142 - 54s 382ms/step - loss: 0.4553 - accuracy: 0.8197 - val_loss: 0.5363 - val_accuracy: 0.7465
Epoch 5/10 - 142/142 - 53s 376ms/step - loss: 0.4410 - accuracy: 0.8233 - val_loss: 0.4333 - val_accuracy: 0.7974
Epoch 6/10 - 142/142 - 56s 392ms/step - loss: 0.4476 - accuracy: 0.8092 - val_loss: 0.5733 - val_accuracy: 0.7640
Epoch 7/10 - 142/142 - 54s 380ms/step - loss: 0.4496 - accuracy: 0.8105 - val_loss: 0.4187 - val_accuracy: 0.8083
Epoch 8/10 - 142/142 - 54s 379ms/step - loss: 0.4012 - accuracy: 0.8412 - val_loss: 0.4654 - val_accuracy: 0.7431
Epoch 9/10 - 142/142 - 53s 373ms/step - loss: 0.3900 - accuracy: 0.8464 - val_loss: 0.3743 - val_accuracy: 0.8469
Epoch 10/10 - 142/142 - 55s 389ms/step - loss: 0.3586 - accuracy: 0.8584 - val_loss: 0.6308 - val_accuracy: 0.6953
plt.figure(figsize=[10, 6])
plt.plot(hist.history["accuracy"], label="Train Accuracy")
plt.plot(hist.history["val_accuracy"], label="Validation Accuracy")
plt.legend()
# Adding a title to the plot
plt.title("Training and Validation Accuracy")
plt.show()
plt.figure(figsize=(10,6))
plt.plot(hist.history['loss'], label = "Train loss")
plt.plot(hist.history["val_loss"], label = "Validation loss")
plt.legend()
plt.show()
evaluation_score = model.evaluate(test_data)
print('Test Loss:', evaluation_score[0])
print('Test Accuracy:', evaluation_score[1])
20/20 [==============================] - 7s 327ms/step - loss: 0.6119 - accuracy: 0.6860
Test Loss: 0.6119285225868225
Test Accuracy: 0.6860326528549194
test_x, test_y = valid_data.__getitem__(1)
preds = model.predict(test_x)
plt.figure(figsize=(16, 16))
for i in range(16):
    plt.subplot(4, 4, i+1)
    plt.title('This Item is %s' % (label[np.argmax(preds[i])]))
    plt.axis('off')
    plt.imshow(test_x[i])
4/4 [==============================] - 0s 20ms/step
from sklearn.metrics import classification_report
true = np.argmax(test_y, axis=1)
pred = np.argmax(preds, axis=1)
print(classification_report(true, pred))
              precision    recall  f1-score   support

           0       0.63      0.96      0.76        76
           1       0.75      0.17      0.28        52

    accuracy                           0.64       128
   macro avg       0.69      0.57      0.52       128
weighted avg       0.68      0.64      0.57       128
from sklearn.metrics import ConfusionMatrixDisplay
ConfusionMatrixDisplay.from_predictions(true,pred)
plt.show()
Build a data preprocessing pipeline to perform data augmentation. (You may use Keras ImageDataGenerator or write your own transformations.)
Report the model performance with the pipeline added. How much performance gain have you achieved?
Profile your input pipeline to identify the most time-consuming operation. What actions have you taken to address that slow operation? (Hint: You may use a profiler such as the TensorFlow Profiler.)
You may notice that with your pipeline, the model performance improves, but at the cost of a longer training time per epoch. Is the additional training time well spent? Compare the dynamic of model performance (e.g., classification accuracy on the test data) with and without data augmentation, when equal training time is spent in the two scenarios.
Identify images that are incorrectly classified by your model. Do they share something in common? How do you plan to improve the model's performance on those images?
# Data augmentation for training data
train_data_gen = ImageDataGenerator(rotation_range=40,
rescale = 1./255,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True,
validation_split=0.2)
# Data augmentation for testing data
test_data_gen = ImageDataGenerator(rescale=1./255)
# Create a train dataset generator
train_dataset = train_data_gen.flow_from_directory(train_set, class_mode='categorical', batch_size=128,
target_size=(224,224), subset='training', color_mode= "rgb",
classes=['Organic', 'Recycle'])
Found 18052 images belonging to 2 classes.
# Create a validation dataset generator
valid_dataset = train_data_gen.flow_from_directory(train_set, class_mode='categorical', batch_size=128,
target_size=(224,224), subset='validation', color_mode= "rgb",
classes=['Organic', 'Recycle'])
Found 4512 images belonging to 2 classes.
# Create a test dataset generator
test_dataset = test_data_gen.flow_from_directory(test_set, class_mode='categorical', batch_size=128,
target_size=(224,224), shuffle=False,color_mode= "rgb",
classes=['Organic', 'Recycle'])
Found 2513 images belonging to 2 classes.
labels = (train_dataset.class_indices)
labels = dict((v,k) for k,v in labels.items())
print(labels)
{0: 'Organic', 1: 'Recycle'}
for batch_data, batch_labels in train_dataset:
    print(batch_data.shape)
    print(batch_labels.shape)
    break
(128, 224, 224, 3)
(128, 2)
#Displaying the sample Images
def display_images_with_labels(image_batch, label_batch, class_names):
    plt.figure(figsize=(10, 10))
    plt.subplots_adjust(wspace=0.8, hspace=0.8)
    for i in range(20):
        plt.subplot(5, 4, i + 1)
        plt.imshow(image_batch[i])
        plt.title(class_names[np.argmax(label_batch[i])])
        plt.axis('off')
display_images_with_labels(batch_data, batch_labels, ['Organic','Recycle'])
data_model = Sequential()
data_model.add(Conv2D(32, (3, 3), input_shape=(224, 224, 3)))
data_model.add(Activation("relu"))
data_model.add(BatchNormalization())
data_model.add(MaxPooling2D())
data_model.add(Conv2D(64, (3, 3)))
data_model.add(Activation("relu"))
data_model.add(BatchNormalization())
data_model.add(MaxPooling2D())
data_model.add(Conv2D(128, (3, 3)))
data_model.add(Activation("relu"))
data_model.add(BatchNormalization())
data_model.add(MaxPooling2D())
data_model.add(Flatten())
data_model.add(Dense(256))
data_model.add(Activation("relu"))
data_model.add(Dropout(0.5))
data_model.add(Dense(64))
data_model.add(Activation("relu"))
data_model.add(Dropout(0.5))
data_model.add(Dense(2)) # Output
data_model.add(Activation("sigmoid"))
import tensorflow as tf
from datetime import datetime
data_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Create a TensorBoard callback
logs = "logs/" + datetime.now().strftime("%Y%m%d-%H%M%S")
tboard_callback = tf.keras.callbacks.TensorBoard(log_dir = logs,
histogram_freq = 1,
profile_batch = '500,520')
data_model.fit(train_dataset, epochs=10, validation_data=valid_dataset, callbacks=[tboard_callback])
Epoch 1/10 - 142/142 - 327s 2s/step - loss: 0.5943 - accuracy: 0.7663 - val_loss: 2.0810 - val_accuracy: 0.4546
Epoch 2/10 - 142/142 - 308s 2s/step - loss: 0.5113 - accuracy: 0.7780 - val_loss: 0.8917 - val_accuracy: 0.6201
Epoch 3/10 - 142/142 - 307s 2s/step - loss: 0.5039 - accuracy: 0.7837 - val_loss: 0.4587 - val_accuracy: 0.8203
Epoch 4/10 - 142/142 - 293s 2s/step - loss: 0.4779 - accuracy: 0.7825 - val_loss: 0.4953 - val_accuracy: 0.7498
Epoch 5/10 - 142/142 - 285s 2s/step - loss: 0.4826 - accuracy: 0.7908 - val_loss: 0.4686 - val_accuracy: 0.7985
Epoch 6/10 - 142/142 - 286s 2s/step - loss: 0.4666 - accuracy: 0.7993 - val_loss: 0.5832 - val_accuracy: 0.6148
Epoch 7/10 - 142/142 - 287s 2s/step - loss: 0.4578 - accuracy: 0.8031 - val_loss: 0.6156 - val_accuracy: 0.5709
Epoch 8/10 - 142/142 - 287s 2s/step - loss: 0.4370 - accuracy: 0.8168 - val_loss: 0.4243 - val_accuracy: 0.8090
Epoch 9/10 - 142/142 - 315s 2s/step - loss: 0.4874 - accuracy: 0.8043 - val_loss: 0.4886 - val_accuracy: 0.7312
Epoch 10/10 - 142/142 - 287s 2s/step - loss: 0.4494 - accuracy: 0.8094 - val_loss: 0.4108 - val_accuracy: 0.8318
<keras.src.callbacks.History at 0x7a4c344b1b70>
# Load the TensorBoard notebook extension.
%reload_ext tensorboard
# Launch TensorBoard and navigate to the Profile tab to view performance profile
%tensorboard --logdir=logs
Some of the time-consuming operations:
Actions taken to address that slow operation:
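One action commonly taken when host-side image decoding and augmentation dominate step time is to prepare batches on a background thread so preprocessing overlaps with training; this is the idea behind `tf.data`'s `prefetch`. A stdlib-only sketch of the concept, with `slow_augment` as a hypothetical stand-in for the real augmentation work:

```python
import queue
import threading
import time

def slow_augment(batch_id):
    """Stand-in for CPU-side decoding/augmentation (the slow operation)."""
    time.sleep(0.01)
    return batch_id

def prefetching_batches(n_batches, buffer_size=4):
    """Yield batches prepared on a background thread, so augmentation
    overlaps with the consumer's (e.g. the GPU's) work on earlier batches."""
    q = queue.Queue(maxsize=buffer_size)

    def producer():
        for i in range(n_batches):
            q.put(slow_augment(i))
        q.put(None)  # sentinel: no more batches

    threading.Thread(target=producer, daemon=True).start()
    while (batch := q.get()) is not None:
        yield batch

processed = list(prefetching_batches(8))
print(processed)  # [0, 1, 2, 3, 4, 5, 6, 7]
```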
Evaluating the model
evaluation_scores = data_model.evaluate(test_dataset)
print('Test Loss:', evaluation_scores[0])
print('Test Accuracy:', evaluation_scores[1])
20/20 [==============================] - 10s 516ms/step - loss: 0.3830 - accuracy: 0.8480
Test Loss: 0.3830283582210541
Test Accuracy: 0.8479904532432556
Report the model performance with the pipeline added. How much performance gain have you achieved?
After adding data augmentation, the test loss decreased from 0.61 to 0.38 and the test accuracy increased from 0.69 to 0.85.
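The gain is even clearer when expressed as a relative reduction in error rate; the figures below are the test accuracies reported above:

```python
acc_before, acc_after = 0.686, 0.848   # test accuracy without / with augmentation
err_before, err_after = 1 - acc_before, 1 - acc_after
relative_error_reduction = (err_before - err_after) / err_before
print(f"error rate: {err_before:.3f} -> {err_after:.3f} "
      f"({relative_error_reduction:.0%} relative reduction)")
# error rate: 0.314 -> 0.152 (52% relative reduction)
```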
# Saving the model in the native Keras format
data_model.save('model.keras')
import pickle
# Save the training history (the metrics dictionary pickles cleanly;
# the History callback object itself holds a reference to the model)
training_history = data_model.history.history
# Save the history to a pickle file
with open("trained_cnn_history.pickle", "wb") as pickle_file:
    pickle.dump(training_history, pickle_file)
# Load the saved training history
with open("trained_cnn_history.pickle", "rb") as pickle_file:
    loaded_history = pickle.load(pickle_file)
import time
# Train the model with data augmentation
start_time = time.time()
train_time = 300  # Training time limit in seconds
while time.time() - start_time < train_time:
    history_1 = data_model.fit(train_dataset, validation_data=valid_dataset, epochs=2, verbose=0)
# Train the model without data augmentation
start_time = time.time()
train_time = 300
while time.time() - start_time < train_time:
    history_2 = model.fit(train_data, validation_data=valid_data, epochs=2, verbose=0)
history_1_dict = history_1.history
train_loss_1 = history_1_dict['loss']
train_acc_1 = history_1_dict['accuracy']
val_acc_1 = history_1_dict['val_accuracy']
val_loss_1 = history_1_dict['val_loss']
history_2_dict = history_2.history
train_loss_2 = history_2_dict['loss']
train_acc_2 = history_2_dict['accuracy']
val_acc_2 = history_2_dict['val_accuracy']
val_loss_2 = history_2_dict['val_loss']
epochs = range(1, len(train_loss_1) + 1)
metrics_df_1 = pd.DataFrame({
'Epochs': epochs,
'Training Loss': train_loss_1,
'Training Accuracy': train_acc_1,
'Validation Accuracy': val_acc_1,
'Validation Loss': val_loss_1,
})
metrics_df_2 = pd.DataFrame({
'Epochs': epochs,
'Training Loss': train_loss_2,
'Training Accuracy': train_acc_2,
'Validation Accuracy': val_acc_2,
'Validation Loss': val_loss_2,
})
metrics_df = pd.concat([metrics_df_1, metrics_df_2], axis=0, ignore_index=True)
print(metrics_df.to_string(index=False))
 Epochs  Training Loss  Training Accuracy  Validation Accuracy  Validation Loss
      1       0.469055           0.803900             0.687500         0.789808
      2       0.426174           0.825172             0.831560         0.462159
      1       0.308169           0.877964             0.832225         0.392471
      2       0.301220           0.884833             0.782358         0.432353
(The first two rows come from the model trained with data augmentation; the last two from the baseline model without augmentation.)
import numpy as np
import matplotlib.pyplot as plt
test_x, test_y = valid_dataset.__getitem__(1)
prediction = data_model.predict(test_x)
incorrectly_classified_indices = []
for i in range(len(prediction)):
    if np.argmax(prediction[i]) != np.argmax(test_y[i]):
        incorrectly_classified_indices.append(i)
plt.figure(figsize=(16, 16))
# Guard against there being fewer than 16 misclassified images in the batch
for i in range(min(16, len(incorrectly_classified_indices))):
    idx = incorrectly_classified_indices[i]
    plt.subplot(4, 4, i + 1)
    plt.title('Predicted:%s / Truth:%s' % (labels[np.argmax(prediction[idx])], labels[np.argmax(test_y[idx])]),
              fontsize=14)
    plt.imshow(test_x[idx])
    plt.axis('off')
plt.tight_layout()
plt.show()
4/4 [==============================] - 0s 24ms/step
A recurring pattern in these images is objects photographed from unusual angles or partially occluded. Since the training images mostly show complete, uncropped objects, such cases are more likely to be misclassified.
We can address this issue with strategies such as using more diverse training data, applying data preprocessing techniques that simulate occlusion and viewpoint changes, and implementing adaptive learning approaches.
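One concrete preprocessing technique for the occlusion issue is random-erasing augmentation, which blanks out a random rectangle so the model learns to cope with partially hidden objects. A minimal NumPy sketch (the function and its parameters are illustrative, not part of the pipeline above):

```python
import numpy as np

def random_erase(image, rng, min_frac=0.1, max_frac=0.3):
    """Blank out a random rectangle covering 10-30% of each image side,
    so the model sees partially occluded objects during training."""
    h, w, _ = image.shape
    eh = int(h * rng.uniform(min_frac, max_frac))
    ew = int(w * rng.uniform(min_frac, max_frac))
    y = rng.integers(0, h - eh + 1)
    x = rng.integers(0, w - ew + 1)
    out = image.copy()
    out[y:y + eh, x:x + ew, :] = 0.0
    return out

rng = np.random.default_rng(0)
img = np.ones((224, 224, 3), dtype=np.float32)
aug = random_erase(img, rng)
print(img.sum() - aug.sum() > 0)  # True: some pixels were erased
```

A transformation like this could be plugged into `ImageDataGenerator` via its `preprocessing_function` argument.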
So far, you have used training and test images from the same source (via random data split). Now collect new test images from a different source. For example, you may take some photos yourself if you used downloaded images before. Otherwise, you may take new photos using a different mobile phone or against a different background.
Show sample images from the original test data and the newly collected test data. In what ways are they different?
Feed the new test data into your model. Report the performance change.
Improve your model so that it generalises better on unseen test images.
You need to include sufficient analysis to demonstrate that:
new_test_set = '/content/new_test/new_test'
newtest_dataset = test_data_gen.flow_from_directory(new_test_set, class_mode='categorical', batch_size=128,
target_size=(224,224), shuffle=False,color_mode= "rgb",
classes=['Organic', 'Recycle'])
Found 21 images belonging to 2 classes.
for batch_data, batch_labels in newtest_dataset:
    print(batch_data.shape)
    print(batch_labels.shape)
    break
(21, 224, 224, 3)
(21, 2)
label = (train_data.class_indices)
label = dict((v,k) for k,v in label.items())
print(label)
{0: 'Organic', 1: 'Recycle'}
#Displaying the sample Images
def display_images_with_labels(image_batch, label_batch, class_names):
    plt.figure(figsize=(10, 10))
    plt.subplots_adjust(wspace=0.8, hspace=0.8)
    for i in range(20):
        plt.subplot(5, 4, i + 1)
        plt.imshow(image_batch[i])
        plt.title(class_names[np.argmax(label_batch[i])])
        plt.axis('off')
display_images_with_labels(batch_data, batch_labels, ['Organic','Recycle'])
# Model evaluation of New Test Data
score = data_model.evaluate(newtest_dataset)
print('Test Loss', score[0])
print('Test accuracy', score[1])
1/1 [==============================] - 1s 956ms/step - loss: 0.6775 - accuracy: 0.5238
Test Loss 0.6774933338165283
Test accuracy 0.523809552192688
#Visualizing the test results
testx, testy = newtest_dataset.__getitem__(0)
predictions = data_model.predict(testx)
plt.figure(figsize=(20, 20))
for i in range(20):
    plt.subplot(5, 4, i+1)
    plt.title('This Item is %s' % (labels[np.argmax(predictions[i])]), fontsize=14)
    plt.axis('off')
    plt.imshow(testx[i])
1/1 [==============================] - 0s 128ms/step
from sklearn.metrics import classification_report
true_values = np.argmax(testy, axis=1)
pred_values = np.argmax(predictions, axis=1)
print(classification_report(true_values, pred_values))
              precision    recall  f1-score   support

           0       0.44      0.40      0.42        10
           1       0.50      0.55      0.52        11

    accuracy                           0.48        21
   macro avg       0.47      0.47      0.47        21
weighted avg       0.47      0.48      0.47        21
Improving the model
Using a pretrained ResNet-50 architecture
We are adopting the ResNet-50 architecture, a convolutional neural network consisting of 50 layers, using a variant pre-trained on the large ImageNet database. The pre-trained network can categorize images into 1000 object categories; in our case we replace the output layer so it classifies into two. This choice of architecture is expected to lower the loss and improve the accuracy of our model.
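The essence of this transfer-learning setup, frozen pre-trained features plus a small trainable head, can be illustrated with a toy NumPy model. Here a fixed projection stands in for the frozen ResNet-50 backbone and a logistic head plays the role of the new two-class output layer; all names and data are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

# Frozen "backbone": a fixed projection standing in for the pre-trained
# ResNet-50 convolutional layers; its weights are never updated.
W_frozen = 0.3 * rng.normal(size=(10, 32))

def features(x):
    return np.tanh(x @ W_frozen)

# Toy two-class data in place of the waste images.
x = rng.normal(size=(200, 10))
y = (x[:, 0] + x[:, 1] > 0).astype(float)

# Trainable head: the only weights we fit, like the new output Dense layer.
F = features(x)
w, b = np.zeros(32), 0.0
for _ in range(1000):                       # plain batch gradient descent
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))  # sigmoid head
    grad = p - y
    w -= 0.5 * F.T @ grad / len(y)
    b -= 0.5 * grad.mean()

acc = ((p > 0.5) == y).mean()
print(f"head-only training accuracy: {acc:.2f}")
```

Training only the head is much cheaper than updating all 50 layers, which is why transfer learning converges in few epochs.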
from tensorflow.keras.applications.resnet50 import ResNet50
resnet_model = ResNet50(input_shape=(224, 224, 3), weights='imagenet', include_top=False)
flatten = Flatten()(resnet_model.output)
predict = Dense(2, activation='sigmoid')(flatten) # Number of Classes = 2
new_data_model = Model(inputs=resnet_model.input, outputs=predict)
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5 94765736/94765736 [==============================] - 1s 0us/step
# Changes in Data Augmentation to Improve the model
new_train_data_gen = ImageDataGenerator(rotation_range=40,
rescale = 1./255,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True,
validation_split=0.2,
channel_shift_range=20,
fill_mode='nearest')
new_train_dataset = new_train_data_gen.flow_from_directory(train_set, class_mode='categorical', batch_size=128,
target_size=(224,224), subset='training', color_mode= "rgb",
classes=['Organic', 'Recycle'])
Found 18052 images belonging to 2 classes.
from tensorflow.keras.optimizers import Adam
adam_optimizer = Adam(learning_rate=0.002) #Changed the Learning Rate
new_data_model.compile(optimizer= adam_optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
# Fit the model
new_data_model_hist = new_data_model.fit(new_train_dataset,
                                         epochs=10,
                                         validation_data=valid_dataset,
                                         steps_per_epoch=len(new_train_dataset),
                                         validation_steps=len(valid_dataset))
Epoch 1/10 - 142/142 - 432s 3s/step - loss: 5.6316 - accuracy: 0.7178 - val_loss: 0.7345 - val_accuracy: 0.4430
Epoch 2/10 - 142/142 - 362s 3s/step - loss: 3.4398 - accuracy: 0.7826 - val_loss: 0.7904 - val_accuracy: 0.4430
Epoch 3/10 - 142/142 - 375s 3s/step - loss: 1.2947 - accuracy: 0.7995 - val_loss: 0.8248 - val_accuracy: 0.4430
Epoch 4/10 - 142/142 - 366s 3s/step - loss: 0.7986 - accuracy: 0.8077 - val_loss: 0.8157 - val_accuracy: 0.4501
Epoch 5/10 - 142/142 - 363s 3s/step - loss: 0.6184 - accuracy: 0.8205 - val_loss: 0.5336 - val_accuracy: 0.7183
Epoch 6/10 - 142/142 - 385s 3s/step - loss: 0.6317 - accuracy: 0.8257 - val_loss: 0.7410 - val_accuracy: 0.5572
Epoch 7/10 - 142/142 - 363s 3s/step - loss: 0.8074 - accuracy: 0.8278 - val_loss: 0.4966 - val_accuracy: 0.7509
Epoch 8/10 - 142/142 - 367s 3s/step - loss: 0.6832 - accuracy: 0.8259 - val_loss: 1.6546 - val_accuracy: 0.7841
Epoch 9/10 - 142/142 - 385s 3s/step - loss: 0.5187 - accuracy: 0.8304 - val_loss: 0.4848 - val_accuracy: 0.7691
Epoch 10/10 - 142/142 - 363s 3s/step - loss: 0.4730 - accuracy: 0.8362 - val_loss: 0.7254 - val_accuracy: 0.6254
The ResNet-50 model reaches 83% training accuracy and a loss of 0.4730 in just 10 epochs, which is notably strong, and training for more epochs can be expected to raise accuracy further. The ample dataset size helps prevent overfitting even with a limited number of epochs, supporting effective generalisation.
Earlier in this unit, we experimented with different training configurations, such as the learning rate and the minibatch size, to see how they affect learning convergence and generalisation on test data. In this research task, you will explore potential explanations for those effects.
Reproduce experiments described in the paper. Compare the results you obtained with the ones in the paper. Do you identify any discrepancies?
Expand the experiments with different datasets and different models. Do you obtain consistent results? How do the results change?
What connections do you discover between the paper and what you have learnt in the unit?
(For HD task only) In addition to short answers to the above questions, submit a short (less than 5 minutes) video presentation for your analysis and main conclusions.
END OF ASSIGNMENT TWO